C:\Aztec33\Programs\Dir33\WorkingWithFiles.txt

The Aztec C65 DIR33 Project for Apple II DOS 3.3
================================================

This project is documented in the source code comments. The 3 disk images
LS33.DSK, DIR33.DSK, and CHTYPE.DSK are configured to display their
respective sources when you run them.
 
What follows is a summary of additional information about working with DOS
3.3 files in a Windows cross-development environment, especially pertaining
to this particular project.

This Document
=============

This document is written from a single perspective. While it is prescriptive,
you may already have your own way of dealing with some or all of the
techniques discussed here. Although I recognize that many alternatives exist
and that many others are more knowledgeable than I in these matters, I am
passing on my opinions and notes for what they are worth, in the hope that
they will be useful.

Dedication
==========

I should apologize to Andy McFadden, the author of CiderPress, in advance
for the verbose and cavalier manner that I have herein documented my own use
of his excellent product which he has so generously made available to all of
us. Without Andy to follow, this project would not have been possible. I can
only recommend that you participate in the cleansing experience of Andy's
tutorial (link listed in this document) to exorcise whatever stray demons
you may encounter by way of my explanations.

Bill Buckels
August 2013

Working With DOS 3.3 Files in Windows and on Disk Images 
========================================================

Essential Tools Required:
=========================

1. Programmer's Editor capable of saving plain text files in different
formats (i.e. "PC", "Apple", "Unix") like TextPad 4. 

Alternately you may use any text editor at all including Windows Notepad, or
a native mode DOS 3.3 editor like Aztec C's VED or a ProDOS text editor like
the one that came with ProTerm. Much of what you read here may be useful to
you anyway.

2. Apple II Disk Manager capable of maintaining DOS 3.3 Disk Images. I use
CiderPress. Even if you don't, much of what you read here may be useful to
you anyway. 

3. Apple II Emulator capable of running DOS 3.3 Disk Images. I use AppleWin.
Even if you don't, much of what you read here may be useful to you anyway.

In Windows (and on other platforms), there are 3 distinct schools of thought
about tools for working with and running DOS 3.3 disks and disk images:

1. Apple II Native mode tools, or porting tools like a null modem cable and
modem software. 

2. Cross-platform (Java based) tools that work equivalently in Windows,
Linux, on the Mac, and other platforms.

3. Windows Native Mode tools that work only in Windows. These include older
Windows tools and MS-DOS tools which may or may not work on your version of
Windows or in emulators like DOSBox.

Over my time in software development, which extends from the '80s to now
and includes but is not limited to the Apple II, I have used all of the
above... they all have their place and represent valid solutions and much
fun. But sitting here on my Windows machine today, I am writing from the
Windows native mode perspective, for what it's worth. I could have written
from any of the perspectives above, but in the interest of my own sanity and
of finishing this project I have narrowed my scope. I encourage you to
experience all of the above if you haven't already.

So without further ado, and from the prescriptive and parochial perspective
of Windows Native Mode tools and one old C programmer, I commend unto you
the following notes, and may the ghosts of the ancient Aztecs have mercy on
your code!

Making DOS 3.3 Disk Images with CiderPress for Use with Aztec C65
=================================================================

The real CiderPress tutorial is at the following link:

http://ciderpress.sourceforge.net/tutorial/

Actions Menu - Add Files Option - File Attribute Preservation
=============================================================

CiderPress offers a couple of different ways to add files to a DOS 3.3 disk:

1. With "file attribute preservation flags"
2. Without "file attribute preservation flags"
  
You do not need to turn off (ignore) "file attribute preservation flags" to
add files without "flags" to a DOS 3.3 disk. But if for some reason you want
to add flagged files to a disk, the option must be turned on so that
CiderPress strips the flags from the filenames and places the flagged files
properly on the disk.

If you are extracting files (not adding files), CiderPress gives you the
option to add flags to the filenames; a useful feature if you wish to
archive these on your Windows machine with filesystem information intact or
to add them to some other disk later.

"file attribute preservation flags" 
===================================

"file attribute preservation flags" is a file naming convention used by
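CiderPress. To use a flagged file name, rename the file by appending a pound
sign followed by a 2 digit file type and a 4 digit load address, both in
hex, to the file's usual name. The file type is the ProDOS-style code that
CiderPress displays for the DOS 3.3 type (see the filetype tables later in
this document):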

README.TXT#040000 - A typical text file 
OV1.OVR#062ad2 - A binary file with a load address at 0x2ad2

Flagged file names tell CiderPress how to put a file on a DOS 3.3 (or
ProDOS) disk. Read the CiderPress help file for more information.

But you don't generally need to use them for hand-building disk images of
your own programs. It is also probably obvious that you need to know about
DOS 3.3 files before using these... read further.
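If you want to check flagged names in a Windows-side build tool of your own,
the naming convention described above can be taken apart with a few lines of
C. This is only a sketch: parse_flagged_name() is a hypothetical helper name
of mine and not part of CiderPress, it handles only the 6 hex digit form
shown in the two examples above, and it is written in ANSI C for a modern
compiler (not for Aztec C65).

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Split "NAME#TTAAAA" into the base name, the 2 digit hex file type and
   the 4 digit hex load address. Returns 1 on success, 0 if the name does
   not end in a '#' followed by exactly 6 hex digits. */
int parse_flagged_name(const char *flagged, char *base, size_t basesize,
                       unsigned *type, unsigned *addr)
{
    const char *hash = strrchr(flagged, '#');
    char hex[5];
    size_t n;
    int i;

    if (hash == NULL || strlen(hash + 1) != 6) return 0;
    for (i = 1; i <= 6; i++)
        if (!isxdigit((unsigned char)hash[i])) return 0;

    n = (size_t)(hash - flagged);          /* length of the usual name */
    if (n == 0 || n >= basesize) return 0;
    memcpy(base, flagged, n);
    base[n] = '\0';

    memcpy(hex, hash + 1, 2); hex[2] = '\0';   /* file type digits */
    *type = (unsigned)strtoul(hex, NULL, 16);
    memcpy(hex, hash + 3, 4); hex[4] = '\0';   /* load address digits */
    *addr = (unsigned)strtoul(hex, NULL, 16);
    return 1;
}
```

Given "OV1.OVR#062ad2" this yields the name OV1.OVR, type 06 and load
address 2ad2 (hex).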

Working without "file attribute preservation flags" 
===================================================

Use CiderPress to add files to a DOS 3.3 disk image with the Add Files
command from the Actions Menu.

TXT and BIN are the two DOS 3.3 file types supported by the Aztec C65
runtime library functions that create new files. You can do file I/O with
all the other DOS 3.3 file types using the Aztec C runtime, with the
exception of RATF's (random access text files), which are discussed later.

These two DOS 3.3 file types (TXT and BIN) are all you need to develop Aztec
C65 DOS 3.3 programs in the Aztec C Shell.

TXT - DOS 3.3 text files do not have a header. Sequential and random-access
text files share the same file type but Aztec C65 for DOS 3.3 expects
sequential text files only, and like ProDOS, the Aztec C Shell and other
Aztec C65 applications expect plain 7 bit Ascii text rather than DOS 3.3
text (hi-bits set).

1. Aztec C Plain Text 
=====================

When saving text files on an MS-DOS or Windows computer to use in the DOS
3.3 Aztec C Shell (7 bit Ascii), save these as Mac Ansi (eol carriage
returns only) and import these directly to a DOS 3.3 disk using CiderPress.
These will show up as FileType F2 ("NON"). Set the FileType to TXT and you
will be good to go.

2. DOS 3.3 High Bits Set 
========================

2.1 For DOS 3.3 text files you can place these Mac Ansi 7 bit text files on
a ProDOS disk first and set the filetype to text, since this format is what
ProDOS uses for text.

Note: Aztec C65 programs that read DOS 3.3 text files may need to strip
hibits which is trivial. Writing DOS 3.3 text files in an Aztec C65 program
is trivial too (see the code in DIR33.c) but it is also explicit so the
programmer needs to be specific about such things as carriage returns making
sure that the ascii value is specified in his local code.

2.2 Then paste these from the ProDOS disk onto a DOS 3.3 disk within
CiderPress. CiderPress will translate these to DOS 3.3 native mode text with
hibits set. This pasting technique also works for translating random access
text files and other filetypes that have been created in ProDOS. Some
information on cross-referencing ProDOS filetypes with DOS 3.3 filetypes is
discussed further in this document.
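The difference between the two text flavours discussed above is only bit 7
of each byte (plus the end of line character), so the conversion can be
sketched in a few lines of C. The function names are my own, and the code is
ANSI C meant for a Windows-side tool; it is not taken from DIR33.c.

```c
#include <stddef.h>

/* DOS 3.3 text -> plain 7 bit Ascii: clear bit 7 of every byte. */
void strip_hibits(unsigned char *buf, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        buf[i] &= 0x7F;
}

/* Plain 7 bit Ascii -> DOS 3.3 text: turn line feeds (Ascii 10) into
   carriage returns (Ascii 13), then set bit 7 of every byte. The
   carriage return value is spelled out explicitly on purpose. */
void to_dos33_text(unsigned char *buf, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (buf[i] == 0x0A)
            buf[i] = 0x0D;
        buf[i] |= 0x80;
    }
}
```

A carriage return therefore ends up as hex 8D in DOS 3.3 text and as hex 0D
in plain text.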

BIN - 4-byte header followed by binary data. The load address and the
length of the data to follow are in a 2 integer (4 byte) header at the start
of the file itself. The short (2 byte) integers are little-endian (low byte
first), the 6502's native byte order. This is the same byte order that Intel
processors and MS-DOS use, not Motorola's high-byte-first order.

1. When saving binary files for use in DOS 3.3, if a header is prepended to
the file CiderPress will use it. Import these directly onto a DOS 3.3 disk.
These will show up as FileType F2 ("NON"). Set the FileType to BIN and you
will be good to go.

Keep in mind that "there is no direct equivalent to NON for DOS 3.3.
Instead, CiderPress used the DOS file type 'S'" and "because DOS 3.3 'S'
files don't have an explicit file length, the length is rounded off to 512
(two DOS sectors)" when CiderPress places these on a DOS 3.3 disk image.

CiderPress will also usually detect a BIN file without a header but you will
need to set your load address manually. If CiderPress does not properly
detect a BIN file and especially if you are at your wit's end trying to
figure-out why, you can follow one of the 2 methods below.

2. You can also use "file attribute preservation flags" (discussed above) to
create headers for headless binary files. This is an easy method to make
sure that CiderPress knows how to place a headless BIN file correctly onto a
DOS 3.3 disk. You will not need to manually set the Filetype from F2 to BIN
later because CiderPress will use your flags to do this for you. But perhaps
you do not want to add messy flags to your Windows filenames.

3. You can also put BIN files without headers on a ProDOS disk image first,
and set the filetype to BIN, and set the load address (Aux Type). Then use
Ciderpress to copy and paste from the ProDOS disk to a DOS 3.3 disk, and
CiderPress will take care of the rest for you.
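For method 1 above, the 4 byte header can be written by a small Windows-side
program before the file is imported. The sketch below assumes the usual DOS
3.3 layout of load address followed by data length, each stored low byte
first; make_bin_header() and write_bin() are hypothetical names of mine, and
the code is ANSI C for a modern compiler.

```c
#include <stdio.h>

/* Fill in the 4 byte BIN header: load address, then data length,
   each as a 2 byte integer with the low byte first. */
void make_bin_header(unsigned addr, unsigned len, unsigned char hdr[4])
{
    hdr[0] = (unsigned char)(addr & 0xFF);
    hdr[1] = (unsigned char)((addr >> 8) & 0xFF);
    hdr[2] = (unsigned char)(len & 0xFF);
    hdr[3] = (unsigned char)((len >> 8) & 0xFF);
}

/* Write the header followed by the data to an open binary stream.
   Returns 0 on success, -1 on a write error. */
int write_bin(FILE *out, unsigned addr, const unsigned char *data,
              unsigned len)
{
    unsigned char hdr[4];

    make_bin_header(addr, len, hdr);
    if (fwrite(hdr, 1, 4, out) != 4) return -1;
    if (len > 0 && fwrite(data, 1, len, out) != len) return -1;
    return 0;
}
```

Open the output file in binary mode ("wb") on Windows so that no end of line
translation creeps into the data.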

However, remember if you add a Mac Ansi text file to a ProDOS disk, setting
the Aux Type to 0000 as you go, then paste onto a DOS 3.3 disk, you will
end-up with a DOS 3.3 text file with hi-bits set rather than the 7-bit Ascii
text that the Aztec C65 DOS 3.3 Shell expects. Aztec C's runtime will strip
the hibits from a DOS 3.3 text file when reading it so you can still read
these, but the shell cannot use them as scripts including as a .PROFILE.

File Naming must follow ProDOS conventions if you paste to a ProDOS disk
first. DOS 3.3 naming is longer (30 characters) and allows control
characters and such.

A ProDOS filename or volume name is up to 15 characters long. It may contain
capital letters (A-Z), digits (0-9), and periods (.), and it must begin with
a letter. Lowercase letters are automatically converted to uppercase.

http://www.easy68k.com/paulrsm/6502/PDOS8TRM.HTM
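If you script your disk builds, the naming rules quoted above are easy to
check before you paste. A minimal sketch in ANSI C follows; the function
name is hypothetical, and lowercase letters are accepted because they are
converted to uppercase anyway.

```c
#include <ctype.h>
#include <string.h>

/* Returns 1 if name follows the ProDOS rules quoted above: 1 to 15
   characters, letters, digits and periods only, starting with a letter. */
int is_prodos_name(const char *name)
{
    size_t i, n = strlen(name);

    if (n < 1 || n > 15) return 0;
    if (!isalpha((unsigned char)name[0])) return 0;
    for (i = 1; i < n; i++) {
        unsigned char c = (unsigned char)name[i];
        if (!isalnum(c) && c != '.') return 0;
    }
    return 1;
}
```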

You can use CiderPress to rename files on a DOS 3.3 disk if you don't like a
ProDOS name. File Dates will be lost in a paste to DOS 3.3 since DOS 3.3
does not use dates in its filing system.

Working With BASIC Programs (in CiderPress)
===========================

If you need a HELLO program for a DOS 3.3 disk, or any DOS 3.3 BASIC program
for that matter, you can create the program in Windows as PC Ansi text and
use CiderPress's "Import BAS from text" Action Menu command. To export a
BASIC program you can view the file by clicking on it, select the text with
your mouse, and copy it to the Windows clipboard using CTRL+C. Then paste it
into your editor, save it as PC Ansi text, edit it as text, and "Import BAS
from text" using CiderPress to get it onto a DOS 3.3 or ProDOS disk. One
other thing though... remove your old BASIC program first... CiderPress may
decide to give your new one a different name if you don't. As noted earlier,
read the CiderPress help file for more info about using CiderPress.

There is at least one other program out there that provides Windows support
for BASIC programs on Apple II disk images. It is called Wasp. Wasp is used
in conjunction with the AppleWin emulator and worthy of mention. 

Of course you can always just "hand-bomb" these in DOS 3.3 BASIC, but if you
are a C programmer and you prefer to write in your Windows Editor, this
feature of CiderPress is probably for you.

Note: This feature of CiderPress is a little more straight-forward than when
I work with Commodore BASIC programs on my Windows machine. For editing an
existing BASIC program on the C64 I use the c1541 utility that comes with
the WinVICE C64 emulator to extract the tokenized BASIC file. Then I use
tok64 (another utility) to provide me with a text copy. After editing I use
tok64 to create a token file in C64 format from the text copy, and finally I
put the edited BASIC program back on the .d64 Commodore 64 disk image using
c1541. When I work with BASIC files using CiderPress it is considerably more
efficient for me. I know I am comparing Apples to Bananas here but think
this worthy of mention to keep things in perspective.

DOS 3.3 BASIC programs may not suit you, and the Aztec C65 Shell for DOS 3.3
has scripting which generally works well to augment C programs for the Shell
which are tiny like BASIC programs. So you really only need a DOS 3.3 BASIC
program for one thing and that is to launch the Aztec C Shell. In ProDOS we
don't have this same problem with the ProDOS shell since it is a system
program and can self launch without AppleSoft being necessary.

Filetype Notes - DOS 3.3 and the view from CiderPress and ProDOS
================================================================

DOS 3.3 supports 4 "main" types of files and 4 "additional" file types,
all identified by letters in a DOS 3.3 CATALOG listing:

DOS 3.3 Main Types (Catalog Type, CiderPress Codes, Description)
==================

DOS 3.1 only supported the following 4 Filetypes:

T - TXT - $04 - Text files, either DOS 3.3 text or plain text.
I - INT - $FA - Integer BASIC Programs - Tokenized to save space. 
A - BAS - $FC - AppleSoft BASIC Programs - Tokenized to save space. 
B - BIN - $06 - Binary files, either executable programs, or data files. 

DOS 3.3 Additional Types
========================

DOS 3.2 added the following 4 additional Filetypes (only 'R' was ever
officially designated by Apple):

S - ??? - $F2 - User type S - data files.
R - REL - $FE - Relocatable binary executable files.
A - ??? - $F3 - User type A+ 
B - ??? - $F4 - User type B+ - Lisa assembler source. 

The DOS 3.3 CATALOG listing does not differentiate between AppleSoft and
User type A+ or Binary and User type B+ files. It duplicates the same
letters (A or B) for the respective main type and user type, so a CATALOG
listing alone is confusing and arguably useless in these 2 cases. However
based on "the law of averages", a type A or B file in a catalog listing will
most likely be a main type. 

Note: When Apple moved on from DOS 3.1 and added these additional types, if
they had named user types A and B as user types 1 and 2 in the catalog
listing, their separate identity would have been obvious. But it's history
now!

ProDOS 8 File types and DOS 3.3 File types
==========================================

In the DOS 3.3 file type tables above, I refer to the filetype codes as
"CiderPress Codes" because when you look at a DOS 3.3 disk in CiderPress, it
uses ProDOS standard filetype values to describe the equivalent DOS 3.3 file
types. This is also useful for "file attribute preservation flags"
(discussed above) for consistency between ProDOS and DOS 3.3. It may also be
a useful view for the average user... but it is a little hard for me to
explain at the programmer level without a considerable amount of verbosity.

DOS 3.3 uses its own file type values which are not displayed on
Ciderpress's "menu".

ProDOS supports 255 possible file type values (not counting file type 0).
DOS 3.3 supports only 8.

ProDOS stores files as "unadorned" data separated from file information.
DOS 3.3 uses different disk storage schemes to store different file types.

ProDOS 8 uses its volume directory and subdirectories (both of which are
also files) to track its files (types, aux types, lengths and other info).
The DOS 3.3 filing system is arguably quite primitive by comparison, and
does not use volume directory files or subdirectory files.

ProDOS Pathing and DOS 3.3 Pathing
==================================

ProDOS 8 provides pathnames (prefixes) which depend on its volume directory
and subdirectories to find files on hard disk as well as floppy media. DOS
3.3 does not (in general terms) support hard disks, so arguably has no need
of pathing of this nature, although the Aztec C DOS 3.3 Shell's command line
supports "backwards path-naming"; pathing between two floppy drives through
its file naming convention of appending a drive pseudoname of ",d1" or ",d2"
to a filename.

A standard DOS 3.3 disk has a structure called a Volume Table of Contents
(VTOC) stored at track $11, sector $00. DOS 3.3's VTOC points to a chain of
sectors called a CATALOG which contains 35 byte "File Descriptive" entries
for each file on the disk. These entries provide the information for the DOS
3.3 CATALOG command. Each entry gives the file type, the file length in
sectors (not in bytes) and the file name (30 Characters).
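To make the 35 byte entry concrete, here is a sketch of how a Windows-side
tool might decode one. The byte layout used (track and sector of the file's
first track/sector list, the type byte, a 30 character name stored with
hi-bits set and padded with spaces, then a 2 byte sector count with the low
byte first) is the commonly documented DOS 3.3 layout and is my assumption
here, since it is not spelled out above. The offset helper also assumes a
140K image in DOS sector order (35 tracks of 16 sectors of 256 bytes). All
names are hypothetical.

```c
#include <stddef.h>

#define DOS33_NAME_LEN 30

struct dos33_entry {
    unsigned char ts_track;    /* track of first track/sector list   */
    unsigned char ts_sector;   /* sector of first track/sector list  */
    unsigned char type;        /* raw file type byte (bit 7 = lock)  */
    char name[DOS33_NAME_LEN + 1];
    unsigned sectors;          /* file length in sectors, not bytes  */
};

/* Byte offset of a track/sector in a DOS-ordered 140K disk image. */
long dsk_offset(unsigned track, unsigned sector)
{
    return ((long)track * 16L + (long)sector) * 256L;
}

/* Decode one 35 byte File Descriptive entry. */
void parse_entry(const unsigned char *e, struct dos33_entry *out)
{
    int i;

    out->ts_track = e[0];
    out->ts_sector = e[1];
    out->type = e[2];
    for (i = 0; i < DOS33_NAME_LEN; i++)
        out->name[i] = (char)(e[3 + i] & 0x7F);   /* strip hi-bits */
    out->name[DOS33_NAME_LEN] = '\0';
    for (i = DOS33_NAME_LEN - 1; i >= 0 && out->name[i] == ' '; i--)
        out->name[i] = '\0';                      /* trim padding  */
    out->sectors = e[33] | ((unsigned)e[34] << 8);
}
```

With these assumptions the VTOC at track $11, sector $00 starts at byte
offset $11000 in the image file.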

ProDOS provides accurate information about file lengths. The ProDOS file
system is the same as the SOS file system; a subject too large for this
document. It is enough to add that with ProDOS and the SOS file system,
Apple moved away from storing file information in 256 byte sectors tied to
a track on a floppy. File information is instead kept in 512 byte blocks
stored as regular files just like any other file, which can be opened and
read as files without the need to read directly from the disk at the sector
level. ProDOS is therefore portable to a variety of media, including media
that hasn't been invented yet. DOS 3.3's filing system is not portable, and
is locked like Latin into the tracks and sectors of floppy disks forever.

DOS 3.3 CATALOG Entry File Type Values - The REAL DOS 3.3 Filetypes
===================================================================

DOS 3.3 Main Types (Catalog Type, DOS 3.3 Entry Values, Description)
==================

T - TXT -  $00 - Text files, either DOS 3.3 text or plain text.
I - INT -  $01 - Integer BASIC Programs - Tokenized to save space. 
A - BAS -  $02 - AppleSoft BASIC Programs - Tokenized to save space. 
B - BIN -  $04 - Binary files, either executable programs, or data files. 


DOS 3.3 Additional Types
========================

S - ??? - $08 - User type S - data files.
R - REL - $10 - Relocatable binary executable files.
A - ??? - $20 - User type A+ 
B - ??? - $40 - User type B+ - Lisa assembler source. 

Note: If the High Bit in the DOS 3.3 Catalog Entry's File Type is set, the
file is locked. If it is not set, the file is unlocked. And that, kids, sums
up the one and only file system attribute of a DOS 3.3 file, besides the
filename, filetype, and number of sectors. The information in the DOS 3.3
CATALOG command is all there is, and the rest is part of the individual file
itself.
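The two sets of tables in this document (the CiderPress codes and the real
DOS 3.3 entry values) and the lock bit can be tied together in a few lines
of C. This is a sketch only, with hypothetical function names, written in
ANSI C; the values are simply the ones listed in the tables above.

```c
/* Returns the CATALOG letter for a raw DOS 3.3 type byte, or '?' if the
   value is not one of the 8 listed types. *locked is set to 1 if the
   high bit (the lock bit) is set, else 0. */
char dos33_type_letter(unsigned char type, int *locked)
{
    *locked = (type & 0x80) != 0;
    switch (type & 0x7F) {
    case 0x00: return 'T';
    case 0x01: return 'I';
    case 0x02: return 'A';
    case 0x04: return 'B';
    case 0x08: return 'S';
    case 0x10: return 'R';
    case 0x20: return 'A';   /* user type A+ shares the letter A */
    case 0x40: return 'B';   /* user type B+ shares the letter B */
    default:   return '?';
    }
}

/* Returns the ProDOS-style code CiderPress shows for a raw DOS 3.3
   type byte, or -1 if the value is not one of the 8 listed types. */
int dos33_to_ciderpress(unsigned char type)
{
    switch (type & 0x7F) {
    case 0x00: return 0x04;  /* TXT */
    case 0x01: return 0xFA;  /* INT */
    case 0x02: return 0xFC;  /* BAS */
    case 0x04: return 0x06;  /* BIN */
    case 0x08: return 0xF2;  /* user type S  */
    case 0x10: return 0xFE;  /* REL */
    case 0x20: return 0xF3;  /* user type A+ */
    case 0x40: return 0xF4;  /* user type B+ */
    default:   return -1;
    }
}
```

A type byte of $84, for example, is a locked BIN file.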

Aztec C65 DOS 3.3 File Types
============================

As stated above, TXT and BIN files are the two types of DOS 3.3 files
supported by the Aztec C65 runtime library functions that create files.
These two DOS 3.3 file types are all you need to develop Aztec C65 DOS 3.3
programs.

C65 DOS 3.3 Link-Library Functions for TXT and BIN Files
========================================================

Reading and writing of text files is generally (but not always) done using
buffered I/O functions like fopen(), and reading and writing of binary files
is generally (but not always) done using raw I/O functions like open().

Internal file format details:
=============================

Some of this is a rehash of info already previously covered but bear with
me... you are ready for a little more now!

SIDs
====

Sectors In Disguise (SIDs) (as files) is probably the best way to view the
DOS 3.3 filing system. When it comes to DOS 3.3 files, you are just one step
above the floppy media that it runs on.

DOS 3.3 programmers have been free to store whatever they wanted in any file
type, just as they have been free to use their own DOS 3.3 disk formats that
still work but are "customized" to suit one purpose or another. The DOS 3.3
internal file formats below are described generally and as such, the
suggested methods for dealing with these formats generally work within that
caveat.

1. TXT - Text files do not have a header. Sequential and random-access text
files share the same file type. It is up to individual programs to know
whether a text file is sequential or random-access. The discussion of text
files in this document generally pertains to sequential text files, whether
they are in plain text (7 bit ascii) or DOS 3.3 text (high bits set) but I
will also cover a little about Random Access Text Files (RATF's) as we go...

1.1 Sequential Text Files 
==========================

The first NUL character stored in a text file marks the end of the file. To
determine the actual length of a DOS 3.3 sequential text file, count the
bytes read until the first NUL character is reached.

Random Access Text Files (RATF's)
========================

Flat Files in Disguise (FFID's) (as text files):
-----------------------------------------------

A chunk of possibly NUL bytes possibly mixed with DOS 3.3 text strings
terminated with carriage returns inserted at random allocated by sectors
with no additional file length information.

That description, while somewhat unfair considering the accomplishments of
the day by Apple Computer, is accurate when it comes to RATF's. But DOS 3.3
records all files only by sectors rather than file length in bytes, so it is
up to the programmer to figure-out how long a file is and what is in the
file for each and every file.

RATF Record Length and Number of Records
========================================

In the ProDOS filing system the record length for these is stored in the
file's Auxiliary type (a record length of 0 indicates a sequential text
file), but DOS 3.3 has no Auxiliary type, nor any provision for an Auxiliary
type or file length in a header since text files have no headers. In the
ProDOS filing system the file length is stored in the applicable volume
directory or subdirectory.

Unless you know the record length used by a random access file, and the
number of records that have been written to it you are guessing, and you
certainly cannot safely edit these. Small wonder that Aztec C65 chose only
to provide support for sequential text files and not for RATF's.

Note: One technique can be used to determine the approximate file offset of
the last active record in a DOS 3.3 RATF (by active record I mean a record
with at least one string in it): look for the last carriage return (Ascii
13) with the hi-bit set (Ascii 141 - hex 8D) that is followed by a NUL byte
(Ascii 0). I am told that you can't depend on that last NUL byte either but
my testing in the release of DOS 3.3 that I am using tells me otherwise.
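The scan described in the note can be sketched as follows for a RATF that
has been read into memory (for example after extracting it to Windows). The
function name is hypothetical, the code is ANSI C, and the result is only as
trustworthy as the technique itself.

```c
#include <stddef.h>

/* Returns the offset of the last hex 8D byte that is immediately
   followed by a NUL byte, or -1 if there is no such pair. */
long last_cr_nul(const unsigned char *data, size_t n)
{
    size_t i;
    long found = -1L;

    for (i = 0; i + 1 < n; i++)
        if (data[i] == 0x8D && data[i + 1] == 0x00)
            found = (long)i;
    return found;
}
```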

Editing RATF's in Aztec C for DOS 3.3 
=====================================

Note: I provide 3 utilities with this release to help you with this:

CHTYPE - Shell Command to Change DOS 3.3 FileTypes.
RAT -    Random Analysis Tool (Shell Command) which also can be used to
	 export RATF's to Comma Separated Value (CSV) files.
RD -     Random Dump (Hex Viewer) with support for several options including
	 viewing the contents of RATF's in a flexible display.

These utilities have their own documentation and commented source code, and
demo disks, and should be reviewed for additional information if they are of
interest to you.

The Only Way
============

The only way you can edit a RATF in Aztec C in DOS 3.3 is to change the
filetype from TXT to some other DOS 3.3 File Type (user type 'S') using RWTS
(Read Write Track Sector), edit it as a type 'S' file, then change the
edited file back to TXT again when done, again using RWTS. Maybe user type
'S' would be a better final target filetype for these, unless you still want
to access them in BASIC.

BASIC only opens text files (which is the reason for RATF's in the first
place) and will report a FILE TYPE MISMATCH error if you try to open some
other DOS 3.3 file type. The rationale is that BIN files in DOS 3.3 are
expected to be used as BLOADable data and BRUNable programs, and the INT and
BAS types are interpreted programs; all belonging to the DOS 3.3 operating
system which is BASIC itself. The REL type is relocatable object code. It is
marked as such forever and is just as unextensible in its format as an INT
or a BAS, with the obvious drawback that Apple Computer wasted a perfectly
good filetype with only 3 more to go. Type 'S' was never defined; it was
"dog-piled" upon and so was ensured by its users to have no agreed upon
format. I have searched in vain for the meaning of 'S'; does it stand for
"Supplementary", "Secondary", "Shared", "Something else", or "Sector"... I
like "Sector(s)" because that is all it is.

If you still need to share these with BASIC you could also consider porting
DOS 3.3 RATF's to ProDOS, where you have file length and record length info
available from the filing system, and clear text to work with, and you could
flip the filetype during your edits and back again when done, the same as
DOS 3.3. But it is somewhat safer to use the SETFINFO call in ProDOS to
change the file type than to be mucking with the catalog sector in DOS 3.3.

You could also use CiderPress to extract a RATF to your Windows drive, maybe
pasting onto a ProDOS disk first to change the text to 7 bit Ascii, then
dump the text to a CSV file from a program written for Windows, and import
the records into Microsoft Access or some other real database program.

See note above about shell utilities for helping you with that.

Guessing RATF's Records - big risk 
=======================

If part of a larger RATF set depends on record positions not changing you
could mess something-up very badly; i.e.  if you wanted to pack the file, or
sort the file. If the RATF contains deleted record marks of some kind you
would need to know this too if you wanted to either pack the file or reuse
the deleted record.

If you are lucky enough to determine the last logical record in the file,
and lucky enough to determine which records are active, and lucky enough to
have no dependencies on record positions (like external indices) stored in
other RATF's (or in key files of whatever nature), then you could append
past the last logical record with new records, and you could use a marking
system (similar to DOS 3.3 catalog entries) to mark deleted records for
reuse, and you could change existing data too.

Guessing RATF's Fields - bigger risk
======================

But you would also need to know field offsets. There are no rules about
putting fields into a RATF. BASIC uses character offsets to do this, and
these are not recorded in the DOS 3.3 filing system (or the ProDOS filing
system either). A BASIC program can put any DOS 3.3 text at RANDOM into
these, and even overflow into subsequent records, skip records, etc. Records
within the file can contain NUL bytes and may even be entirely empty or
padded with whitespace, so you can't even depend on a NUL byte record
terminator.

Guessing RATF's Right - the optimistic side
=====================

In Jeff (Rubywand) Hurlburt's DOS Mini-Manual - csa2 version:

http://apple2.org.za/gswv/a2zine/faqs/Csa2DOSMM.html

He is more charitable than I am when he says a random-access text file
(RATF) may be thought of as a set of "mini sequential access files"
separated by strings of NUL characters padding-out to a fixed record length.

Assuming for the moment you get a RATF that is nicely filled
with data... mini sequential access files that stay within their own
records. If you want to decipher the thing to determine record length and
field positions, you will need to view the file in a hex viewer that strips
the hibits from DOS 3.3 text to determine some commonality between record
boundaries and also between fields in each of the records. If the first
field is a first name, and the second field is a last name, this will be
easy. So let's assume that every record is uniformly and properly formed
into logical fields.
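Under that optimistic assumption (a known record length, with fields stored
as carriage return terminated strings followed by NUL padding), turning one
record into a line of comma separated text is straightforward. The sketch
below is ANSI C for a Windows-side program and is not the code of the RAT
utility; the function name is hypothetical, and fields that themselves
contain commas or quotes are not escaped.

```c
#include <stddef.h>

/* Convert one fixed length RATF record to a comma separated line in
   out. Hi-bits are stripped, each carriage return ends a field, and the
   first NUL byte (or the end of the record) ends the record.
   Returns the number of fields found. */
int ratf_record_to_csv(const unsigned char *rec, size_t reclen,
                       char *out, size_t outsize)
{
    size_t i, o = 0;
    int fields = 0, pending = 0, open = 0;

    if (outsize == 0) return 0;
    for (i = 0; i < reclen && (rec[i] & 0x7F) != 0; i++) {
        unsigned char c = (unsigned char)(rec[i] & 0x7F);
        if (c == 0x0D) {              /* end of a field */
            fields++;
            pending++;                /* comma owed before next field */
            open = 0;
            continue;
        }
        while (pending > 0 && o + 1 < outsize) {
            out[o++] = ',';
            pending--;
        }
        if (o + 1 < outsize)
            out[o++] = (char)c;
        open = 1;
    }
    if (open) fields++;               /* last field had no terminator */
    out[o] = '\0';
    return fields;
}
```

A record holding a first name and a last name comes out as a line such as
BILL,BUCKELS.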

You can use Ciderpress's built-in hex viewer to analyze a RATF. That's
pretty easy. And you can do all this other stuff too. 

Or you can use the analysis and dump tools and hex viewer that I have
provided for you with this release and have some fun. But that's enough
about RATF's. It's time to look at the other internal file formats.

DOS 3.3 Internal File Formats Other Than Text Files
===================================================

One little, 2 little, 3 little endians...

2. INT and BAS - 2-byte header followed by tokenized data. The length of
the tokenized data to follow (effective file length) is in a 1 integer (2
byte, 1 little endian) header at the start of the file itself.

3. BIN - 4-byte header followed by binary data. The load address and the
length of the data to follow are in a 2 integer (4 byte, 2 little endians)
header at the start of the file itself.

4. REL - 6-byte header followed by data. The original load address, the
length of all the data to follow (program image + relocation dictionary),
and the length of the program image are in a 3 integer (6 byte, 3 little
endians) header at the start of the file itself. A Relocatable file contains
the image of a program, followed by a relocation dictionary containing the
information necessary to relocate the program to an arbitrary memory
location.

Note: This description of the DOS 3.3 REL file format is based on the csa2
Apple II FAQ. The file format described here does not match the Aztec C REL
object format used by this version of the C65 compiler.
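The three header layouts described in items 2 to 4 can be decoded with one
small routine once the first few bytes of a file are in memory. This is a
sketch in ANSI C with hypothetical names; the REL fields are read in the
order given in the description above, which I have not verified against
real REL files.

```c
/* Header fields; any field a file type does not have is set to 0. */
struct dos33_header {
    unsigned addr;        /* load address (BIN, REL)                */
    unsigned length;      /* length of the data after the header    */
    unsigned image_len;   /* REL only: length of the program image  */
};

/* Read a 2 byte integer stored low byte first. */
static unsigned le16(const unsigned char *p)
{
    return p[0] | ((unsigned)p[1] << 8);
}

/* type is the CATALOG letter: 'I', 'A', 'B' or 'R'. Returns the size
   of the header in bytes, or 0 for types with no known header. */
int parse_header(char type, const unsigned char *p,
                 struct dos33_header *h)
{
    h->addr = h->length = h->image_len = 0;
    switch (type) {
    case 'I':
    case 'A':
        h->length = le16(p);
        return 2;
    case 'B':
        h->addr = le16(p);
        h->length = le16(p + 2);
        return 4;
    case 'R':
        h->addr = le16(p);
        h->length = le16(p + 2);
        h->image_len = le16(p + 4);
        return 6;
    default:
        return 0;
    }
}
```

Remember that the catalog letters A and B are shared with user types A+ and
B+, which have no defined header, so the caller has to know which one it is
looking at.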

5. The other three file types (S, A+, and B+) have never been consistently
defined by anybody. Several programs use these file types (especially type
S) to store their private data files, but there doesn't seem to be any
agreement on their internal format. Or so the story goes.

Determining File Length of DOS 3.3 Files
=========================================

Sequential Text File Example
============================

As noted above, you can determine the length of a sequential text file by
reading through it with buffered I/O and counting the bytes. Aztec C treats
the first NUL byte as the end of a sequential text file, so it stops there
for you. Also note in the example below that fgets() returns an integer (and
not a char pointer) in this old compiler, unlike in Ansi C:

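#include <stdio.h>

/* Print and return the length in bytes of a sequential text file,
   or -1 if the file cannot be opened. */
textlen(fname)
char *fname;
{
  FILE *fp;
  unsigned fl = 0;
  int c;
  char buf[512];

  fp = fopen(fname,"r");
  if (NULL == fp) return -1;
  /* fgets() returns an int in this compiler; 0 means end of file */
  while (fgets(buf,512,fp)!=0) {
    c = strlen(buf);
    if (c < 1) break;
    fl += c;
  }
  fclose(fp);
  printf("File Length = %u bytes.\n",fl);
  return fl;
}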

You cannot do this with a random access text file. Aztec C65 cannot read
RATF's properly (if at all) even using unbuffered I/O without changing
filetype first (see preceding notes).

All Files Example - and additional file info notes:
=================

Files like BIN files that store the length of their data in a header are the
easiest to determine file length for. That is, if the length fields in the
files have not been "adjusted" and used for some other purpose. The native
version of Aztec C65 did exactly that with the libraries that you have
here. So you can't always depend on the length field, even in a BIN file.

The following works in Aztec C with all files (except random access text
files). However, it reads full sectors on every file except a text file, so
you can read past the end of the data in a BIN file and you will still need
the length info in the header if available for cross-checking.

showlen(fname)
char *fname;
{
  int fh, len = -1, c;
  char buf[256];    /* one DOS 3.3 sector */

  fh = open(fname,0);    /* 0 = open for reading */
  if (fh > -1) {
    for (len=0;;) {
      c = read(fh,buf,256);
      if (c < 1) break;
      len += c;
      if (c < 256) break;    /* short read: end of file */
    }
    close(fh);
    /* report actual number of bytes read */
    printf("%d bytes read from %s\n",len,fname);
  }
  return len;
}

Verification of Byte Length of Files in DOS 3.3
===============================================

You can use the unbuffered method shown above to read the proper length 
of a sequential text file.  

Reading the length from the header of BASIC files, BIN files, and REL files
is easy if the file is not damaged. Then while reading the rest of the file,
this length can be confirmed by checking the number of bytes returned by the
read() function as shown in the example above.

But there is no way to accurately determine the file length in bytes for the
other DOS 3.3 file types even by reading the CATALOG sectors.
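
The header check described above might be sketched as follows. This is only
a sketch under stated assumptions: it assumes the usual DOS 3.3 BIN layout
of a 2-byte load address followed by a 2-byte data length, both low byte
first, and it assumes that the byte count returned by the read() loop
includes the 4 header bytes and is padded to a full sector. The function
names are made up for this example. It is written in ANSI C so it can be
tried on the Windows side; for Aztec C65 the function headers would need to
be rewritten in the old K&R style used elsewhere in this document.

```c
/* Sketch only: assumes a BIN file whose first 4 bytes are
   load address (2 bytes) then data length (2 bytes), low byte first. */

/* Load address from header bytes 0 and 1. */
unsigned int bin_load_addr(const unsigned char *hdr)
{
    return (unsigned int)hdr[0] | ((unsigned int)hdr[1] << 8);
}

/* Data length from header bytes 2 and 3. */
unsigned int bin_data_len(const unsigned char *hdr)
{
    return (unsigned int)hdr[2] | ((unsigned int)hdr[3] << 8);
}

/* Cross-check the header length against the bytes actually read.
   Assumption: bytes_read counts the 4 header bytes plus the data,
   padded out to a whole 256-byte sector.
   Returns 1 if the two agree to within one sector, 0 otherwise. */
int bin_len_plausible(const unsigned char *hdr, long bytes_read)
{
    long need = 4L + (long)bin_data_len(hdr);

    return bytes_read >= need && bytes_read - need < 256L;
}
```

A result of 0 does not prove the file is damaged. As noted earlier, the
length field may simply have been used for something else.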

Using fseek() and lseek() in DOS 3.3
====================================

In the example above, I have used the read() function to determine a file's
length. But with the exception of text files, the main DOS 3.3 file types
already have a length field in their header that you can count on for
something resembling a file length integrity check, so you can use lseek()
or fseek() to find a position in those files.

fseek() and lseek() do not work the same way in Aztec C65 for DOS 3.3 as
they do on today's systems. You can only use them to seek from a position
relative to the beginning of a file or from the current position (not from
the end of the file). Similarly, append mode is not supported by open() and
fopen().

By now the reason for this is probably obvious. As I said earlier, DOS 3.3
files are Sectors in Disguise as files. 
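
If you already know a file's length (from its header, for example), a seek
relative to the end can be approximated with a seek from the beginning. The
following is a minimal sketch of that idea, not something taken from the
Aztec C65 library: the name seek_from_end() and its parameters are
hypothetical, and it is written in ANSI C for trying out on the Windows
side. In Aztec C65 the same thing would be done in K&R style, passing 0 as
the origin for "from the beginning".

```c
#include <stdio.h>

/* Position fp 'back' bytes before the end of the data, using only a
   seek from the beginning of the file.
   known_len is the length obtained elsewhere (e.g. from the header).
   Returns 0 on success, -1 if the request is out of range or the
   seek fails. */
int seek_from_end(FILE *fp, long known_len, long back)
{
    if (back < 0L || back > known_len)
        return -1;
    /* SEEK_SET means "from the beginning" (origin 0) */
    return fseek(fp, known_len - back, SEEK_SET) == 0 ? 0 : -1;
}
```

If the length you pass in is wrong, the position will be wrong too, so this
is only as good as the length field it is based on.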

Seeking a RATF
==============

I have categorically stated that the only way to read a RATF in an Aztec C
program is to change its file type to some type other than text first, and I
have suggested that the type 'S' Sector file be used for this purpose since
its use has been a mystery for many decades now and it has withstood the
test of time.

You can seek to the beginning of each logical field in a RATF
using either lseek() or fseek() and read data if it's there and nothing if
it's not. 
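
As a rough illustration of seeking by record, the sketch below assumes
fixed-length records, so that record number N starts at byte N times the
record length, and it assumes that unwritten bytes read back as zeros and
that a field ends with a carriage return. The name read_record() and its
parameters are hypothetical. It is written in ANSI C so it can be tried on a
copy of the data on the Windows side (open the file in binary mode); in
Aztec C65 it would need K&R style function headers, and the file type would
have to be changed first as described above.

```c
#include <stdio.h>

/* Read the first field of record 'recno' (counting from 0) into out,
   which must have room for reclen + 1 bytes.
   Assumes fixed-length records of 'reclen' bytes.
   Returns the number of characters stored (0 if there is nothing
   there), or -1 if the seek fails. */
int read_record(FILE *fp, long recno, int reclen, char *out)
{
    int i, c;

    /* seek from the beginning of the file to the start of the record */
    if (fseek(fp, recno * (long)reclen, SEEK_SET) != 0)
        return -1;

    for (i = 0; i < reclen; i++) {
        c = fgetc(fp);
        if (c == EOF)
            break;
        c &= 0x7f;                  /* strip the high bit */
        if (c == 0 || c == 0x0d)    /* unwritten byte or end of field */
            break;
        out[i] = (char)c;
    }
    out[i] = '\0';
    return i;
}
```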

Please feel free to try this yourself if you are so inclined. I recommend
reading the documentation and the source and then trying the demos provided
as noted above before you begin mucking with your own data, and please work
on a copy.

Files in Disguise (FID's) (as other files):
===========================================

Beware Will Robinson! There is danger ahead!

There are no guarantees of file integrity in DOS 3.3. As I said previously,
and others said before me, DOS 3.3 programmers have been free to store
whatever they wanted in any file type.

An example of this is Aztec C's library file format. These are listed in the
DOS 3.3 file system as binary files, but instead of the usual binary header,
they contain the same library header as their MS-DOS and CP/M counterparts
of the same era. 

This was a matter of convenience on the part of Aztec C's developers. DOS
3.3 doesn't have any control over the header info that a programmer sticks
in his files... it just merrily allocates sectors and the programmer does
the rest. Aztec C's runtime library can open and write to and read from any
file type (with the exception of RATF's) but Aztec C65 only creates files of
2 types: BIN and TXT.

You can of course create a file as type BIN that is destined to become any
other type and control the writing of data to it, since you can write
anything to a BIN file in Aztec C65. Then after you close the file, you can
change the type to whatever you wish (of the 8 types available of course).
Aztec C's developers saw no need to write special routines to do so with
library files. After all, DOS 3.3 takes care of the sectors, and the Aztec C
librarian program (MKLIB) deals with these exclusively (presumably), and the
libraries are distributed with locks set, so what does it matter?

I apologize if I shattered your vision of some higher order existing at this
early stage of Apple's filing systems. Since before the days of Troy, and
Aesop's tale of wolves wearing sheepskin jackets, things have not always
been as they seem, so as I also promised at the beginning, I have described
the DOS 3.3 internal file formats generally and the suggested methods for
dealing with these formats generally work within that caveat.

But wait, there's more! Our epistle will end today with some additional and
partially rehashed notes about sequential text files in the Aztec C65
environment.

Text Files in the Aztec C65 Shell for DOS 3.3
=============================================

Native DOS 3.3 Text Files - High Bit Set
========================================

Native DOS 3.3 text files store 7 bit text with the high bit set;
effectively a raw text file on a DOS 3.3 disk is in high ascii (8 bit ascii)
but really only stores 7 bit ascii values. In order to obtain 7 bit clear
text from the raw format of a DOS 3.3 text file, you need to strip the high
bits.

i.e. AppleChar & 0x7f = LowAsciiChar
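
Applied to a whole buffer, the mask above might look like the following
minimal sketch. The function names strip_high_bits() and set_high_bits() are
hypothetical and are not part of Aztec C65. The code is written in ANSI C;
for Aztec C65 the function headers would need to be in K&R style.

```c
/* Convert n bytes of raw DOS 3.3 text (high ascii) to low ascii,
   in place, by clearing the high bit of every byte. */
void strip_high_bits(unsigned char *buf, int n)
{
    int i;

    for (i = 0; i < n; i++)
        buf[i] &= 0x7f;
}

/* The reverse: set the high bit of every byte, in place, to turn
   low ascii text into the native DOS 3.3 form. */
void set_high_bits(unsigned char *buf, int n)
{
    int i;

    for (i = 0; i < n; i++)
        buf[i] |= 0x80;
}
```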

Aztec C65 Text Files - Plain Text
=================================

The Aztec C Shell for DOS 3.3 (and Aztec C65 itself) creates low ascii text
files (7 bit ascii) and expects low ascii text files as shell scripts. This
includes the .PROFILE shell script that runs when the shell starts. 

The DOS 3.3 Shell's cat Command - Plain Text Display of All Text Files
======================================================================

The DOS 3.3 shell's built-in cat command which lists text files to the
screen does not care whether it is listing a native DOS 3.3 high ascii text
file or an Aztec C low ascii text file to the screen; it strips the high
bits from the DOS 3.3 text file so you can't tell the difference when
viewing the file using cat... it lists either type of file equivalently.

End of Line Character - Ascii 13 (carriage return)
==================================================

Both types of text files effectively use the Apple Text eol (end of line)
convention of a carriage return (Ascii 13) in their raw format, with the
caveat that the raw DOS 3.3 text file is high ascii so the high bit is set
even on carriage returns.

This simple bit of knowledge can save you much noodle-scratching especially
when preparing disks with baggage files like shell scripts and DOS 3.3 text
files.
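
If your editor cannot save in "Apple" format, the end of line conversion can
be done in a few lines of code on the Windows side before the file goes onto
the disk image. The following is a minimal sketch in ANSI C, assuming the
input is low ascii text with "PC" (carriage return plus line feed) or "Unix"
(line feed only) line endings. The name to_apple_eol() is hypothetical.

```c
/* Copy n bytes from in to out, converting PC (CR LF) and Unix (LF)
   line endings to the Apple convention of a single CR (ascii 13).
   out must have room for at least n bytes.
   Returns the number of bytes written to out. */
int to_apple_eol(const char *in, int n, char *out)
{
    int i, j = 0;

    for (i = 0; i < n; i++) {
        if (in[i] == 0x0a) {                  /* line feed */
            if (i > 0 && in[i - 1] == 0x0d)
                continue;                     /* CR LF: CR already copied */
            out[j++] = 0x0d;                  /* lone LF becomes CR */
        } else {
            out[j++] = in[i];                 /* everything else as-is */
        }
    }
    return j;
}
```

The output is still low ascii, which is what the Aztec C shell expects for
its scripts. A native DOS 3.3 text file would also need the high bit set on
every byte, carriage returns included.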

Best Regards,

Bill Buckels
bbuckels@mts.net

August 11, 2013

End of Document

